Nous Research Unveils NousCoder-14B, an AI Model for Competitive Programming
NousCoder-14B achieves high accuracy on competitive programming evaluations.
Discover how Agentic Memory optimizes memory management in LLM agents.
MAI-UI surpasses competing models on mobile GUI benchmarks.
Explore how LFM2-2.6B-Exp boosts performance through reinforcement learning.
Discover NVIDIA's Orchestrator-8B, which uses reinforcement learning to improve tool selection.
Sakana AI introduces Reinforcement-Learned Teachers (RLTs): small models trained with reinforcement learning to generate step-by-step explanations, efficiently teaching reasoning to larger language models.
Microsoft and Tsinghua researchers propose Reward Reasoning Models (RRMs), reward models that reason before judging and adaptively allocate more test-time compute to harder evaluations, improving large language model judgment and alignment on complex tasks.
Tsinghua University researchers developed the Absolute Zero paradigm to train large language models without any external data: the model proposes its own tasks and learns to solve them, with a code executor verifying answers to provide reward (a minimal sketch follows).
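As a rough illustration of that propose-execute-solve loop, here is a minimal Python sketch. It is our own toy reconstruction under the blurb's description, not the authors' code: `run_program`, the sample program, and the stand-in solver prediction are all hypothetical.

```python
# Toy sketch of the Absolute Zero reward loop (our illustration):
# a "proposer" emits a small program and an input, the code executor
# runs it to obtain the ground-truth output, and a "solver" is
# rewarded for predicting that output.

def run_program(src: str, x):
    """Code executor: run the proposed program and return f(x)."""
    env = {}
    exec(src, env)  # in the real system this would be sandboxed
    return env["f"](x)

# A proposed task (in Absolute Zero the model generates these itself).
program = "def f(x):\n    return x * x + 1"
x = 3
target = run_program(program, x)   # executor-verified answer: 10

solver_prediction = 10             # stand-in for the solver model's answer
reward = 1.0 if solver_prediction == target else 0.0
print(target, reward)              # -> 10 1.0
```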
Researchers from Tsinghua University and Shanghai AI Lab introduce TTRL (Test-Time Reinforcement Learning), a method that lets large language models improve without labeled data by deriving pseudo-rewards from majority voting over their own sampled answers at inference time (see the sketch below).
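To make the TTRL blurb concrete, here is a minimal sketch of its core reward rule: sample several answers to an unlabeled question, take the majority answer as a pseudo-label, and reward the rollouts that agree with it. The function name and toy data are our own illustration, not the authors' code.

```python
from collections import Counter

def majority_vote_rewards(answers):
    """Given final answers extracted from N sampled rollouts for one
    unlabeled question, use the majority answer as a pseudo-label and
    assign reward 1.0 to rollouts that match it, 0.0 otherwise."""
    pseudo_label, _ = Counter(answers).most_common(1)[0]
    return [1.0 if a == pseudo_label else 0.0 for a in answers]

# Example: 5 rollouts, 3 agree on "42", so "42" becomes the pseudo-label.
print(majority_vote_rewards(["42", "41", "42", "42", "7"]))
# -> [1.0, 0.0, 1.0, 1.0, 0.0]
```

These rewards would then feed a standard RL update over the sampled rollouts; no ground-truth labels are needed at any point.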